-
Notifications
You must be signed in to change notification settings - Fork 24.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support offset in composite aggs #50609
Conversation
Adds support for the `offset` parameter to the `date_histogram` source of composite aggs. The `offset` parameter is supported by the normal `date_histogram` aggregation and is useful for folks that need to measure things from, say, 6am one day to 6am the next day. This is implemented by creating a new `Rounding` that knows how to handle offsets and delegates to other rounding implementations. That implementation doesn't fully implement the `Rounding` contract, namely `nextRoundingValue`. That method isn't used by composite aggs so I can't be sure that any implementation that I add will be correct. I propose to leave it throwing `UnsupportedOperationException` until I need it. Closes elastic#48757
Pinging @elastic/es-docs (>docs) |
Pinging @elastic/es-analytics-geo (:Analytics/Aggregations) |
Please have a close look at this one! It has wire changes and it has been more than a year since I made any of those.... |
|
||
@Override | ||
public void innerWriteTo(StreamOutput out) throws IOException { | ||
if (out.getVersion().before(Version.V_8_0_0)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't actually serialize this Rounding
but I believe I've got this bit right so I kept it. I'd like to move the other offset
code over to this Rounding which means that we will eventually serialize it. At that point we'll get exhaustive tests for this code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Modulo the version check which I discuss below.
@@ -80,12 +91,18 @@ protected DateHistogramValuesSourceBuilder(StreamInput in) throws IOException { | |||
super(in); | |||
dateHistogramInterval = new DateIntervalWrapper(in); | |||
timeZone = in.readOptionalZoneId(); | |||
if (in.getVersion().after(Version.V_8_0_0)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is the way to do this, but it has been a long time. I want to land this change in master and 7.x, but these tests have no change of passing the bwc tests until I backport it.
Now that I think about it, maybe this version check should be V_7_6_0
actually. Help!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The tests caught an error - this at least be onOrAfter
but I'll switch it to V_7_6_0
while I'm there.
@@ -287,6 +287,72 @@ Time zones may either be specified as an ISO 8601 UTC offset (e.g. `+01:00` or | |||
`-08:00`) or as a timezone id, an identifier used in the TZ database like | |||
`America/Los_Angeles`. | |||
|
|||
*Offset* | |||
|
|||
include::datehistogram-aggregation.asciidoc[tag=offset-explanation] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I figured because the option is the same I should just include the docs from the normal date histogram.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
Hurray! Integration test failure! Will debug. Good robots, catching things. |
@elasticmachine, run elasticsearch-ci/2 |
The default distro failure is due to the thing not being backported. I'll find a way to get this working. |
I believe I've got this sorted thanks to @DaveCTurner and @cbuescher. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left a few comments/questions, but I think this looks good. Main question was around the new Rounding class, most of the rest is cosmetic :)
@@ -287,6 +287,72 @@ Time zones may either be specified as an ISO 8601 UTC offset (e.g. `+01:00` or | |||
`-08:00`) or as a timezone id, an identifier used in the TZ database like | |||
`America/Los_Angeles`. | |||
|
|||
*Offset* | |||
|
|||
include::datehistogram-aggregation.asciidoc[tag=offset-explanation] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
@@ -556,19 +567,73 @@ public boolean equals(Object obj) { | |||
} | |||
} | |||
|
|||
static class OffsetRounding extends Rounding { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the intention to migrate the existing users of offsets (regular date histo agg, etc) over to this wrapper? E.g. it seems a bit heavyweight to create a whole new rounding, but if the idea is to followup with changes to the existing users of offset (rather than baking the logic into the aggs) it makes sense to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, we should probably add a javadoc explaining why/when to use this class
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I did intend to migrate the rest over of the uses of offset
over to it.
I added javadoc on the public interface to the class which is what the rest of the classes in this file do. Do you think I should duplicate it onto this one? Or add something maybe?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, missed that it was documented on the public method. Since the rest of the class does it that way, fine with me :)
Thanks for the explanation!
|
||
OffsetRounding(StreamInput in) throws IOException { | ||
delegate = Rounding.read(in); | ||
offset = in.readLong(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe read/write ZLong instead? I suspect offsets will be small-to-medium'ish sized and either positive or negative, so zlong might be a win?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In other places offsets are written as long and I just copied it. But I'd be fine with zlong.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pushed the switch to zlong.
byte id = in.readByte(); | ||
switch (id) { | ||
case TimeUnitRounding.ID: | ||
rounding = new TimeUnitRounding(in); | ||
break; | ||
return new TimeUnitRounding(in); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🙏 for the cleanup here :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure everyone would consider that cleaner, but I sure do!
@@ -195,6 +195,16 @@ public void testTimeUnitRoundingDST() { | |||
assertThat(tzRounding_chg.round(time("2014-11-02T06:01:01", chg)), isDate(time("2014-11-02T06:00:00", chg), chg)); | |||
} | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, do you know if we have any serialization tests for Rounding
? I was looking to see if we test that the id()
bytes are correct and don't accidentally change (similar to what we do with AbstractWriteableEnumTestCase
enum tests)... but couldn't find any serialization tests at all.
If we have them somewhere, let's add a test for the id byte. If not, probably too much to add to this PR but we should file a ticket so we don't forget to add some tests... makes me uneasy that such a widespread class doesn't have serialization tests :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe we end up testing serialization as part of testing things like extended bounds bucket response. Adding a unit test for just this class's serialization makes sense to me though. I figured I'd wait until I used it in a context where we serialized it but since I'm writing the serialization code now I probably ought to write the test now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I pushed a wire test case.
@@ -195,6 +195,16 @@ public void testTimeUnitRoundingDST() { | |||
assertThat(tzRounding_chg.round(time("2014-11-02T06:01:01", chg)), isDate(time("2014-11-02T06:00:00", chg), chg)); | |||
} | |||
|
|||
public void testOffsetRounding() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's add another test that does negative offsets too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I pushed a test with negative offsets.
The serialization was subtly wrong, particularly in the Eire time zone.
@polyfractal, I've pushed everything you asked for. |
I reworked the serialization a bit, moving the write methods so they are near the read methods and replacing |
@@ -418,24 +434,16 @@ public boolean equals(Object obj) { | |||
return false; | |||
} | |||
TimeUnitRounding other = (TimeUnitRounding) obj; | |||
return Objects.equals(unit, other.unit) && Objects.equals(timeZone, other.timeZone); | |||
return unit == other.unit && timeZone.equals(other.timeZone); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll revert this change. It snuck in because I thought the comparison was wrong.
|
||
@Override | ||
public String toString() { | ||
return "Rounding[" + interval + " in " + timeZone + "]"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The toString
implementations weren't consistent which made reading the error messages hard. So I changed them.
Adds support for the `offset` parameter to the `date_histogram` source of composite aggs. The `offset` parameter is supported by the normal `date_histogram` aggregation and is useful for folks that need to measure things from, say, 6am one day to 6am the next day. This is implemented by creating a new `Rounding` that knows how to handle offsets and delegates to other rounding implementations. That implementation doesn't fully implement the `Rounding` contract, namely `nextRoundingValue`. That method isn't used by composite aggs so I can't be sure that any implementation that I add will be correct. I propose to leave it throwing `UnsupportedOperationException` until I need it. Closes elastic#48757
Thanks for the reviews @polyfractal and @jimczi ! |
Adds support for the `offset` parameter to the `date_histogram` source of composite aggs. The `offset` parameter is supported by the normal `date_histogram` aggregation and is useful for folks that need to measure things from, say, 6am one day to 6am the next day. This is implemented by creating a new `Rounding` that knows how to handle offsets and delegates to other rounding implementations. That implementation doesn't fully implement the `Rounding` contract, namely `nextRoundingValue`. That method isn't used by composite aggs so I can't be sure that any implementation that I add will be correct. I propose to leave it throwing `UnsupportedOperationException` until I need it. Closes #48757
When deserializing time zones in the Rounding classes we used to include a tiny normalization step via `DateUtils.of(in.readString())` that was lost in elastic#50609. Its at least necessary for some tests, e.g. the cause of elastic#50827 is that when sending the default time zone ZoneOffset.UTC on a stream pre 7.0 we convert it to a "UTC" string id via `DateUtils.zoneIdToDateTimeZone`. This gets then read back as a UTC ZoneRegion, which should behave the same but fails the equality tests in our serialization tests. Reverting to the previous behaviour with an additional normalization step on 7.x.
When I implemented elastic#50609 I suppressed a few backwards compatibility checks until I finished the backport. I've now finished the backport so this enables the tests.
When I implemented #50609 I suppressed a few backwards compatibility checks until I finished the backport. I've now finished the backport so this enables the tests.
We added a new rounding in elastic#50609 that handles offsets to the start and end of the rounding so that we could support `offset` in the `composite` aggregation. This starts moving `date_histogram` to that new offset.
When deserializing time zones in the Rounding classes we used to include a tiny normalization step via `DateUtils.of(in.readString())` that was lost in #50609. Its at least necessary for some tests, e.g. the cause of #50827 is that when sending the default time zone ZoneOffset.UTC on a stream pre 7.0 we convert it to a "UTC" string id via `DateUtils.zoneIdToDateTimeZone`. This gets then read back as a UTC ZoneRegion, which should behave the same but fails the equality tests in our serialization tests. Reverting to the previous behaviour with an additional normalization step on 7.x. Co-authored-by: Nik Everett <[email protected]> Closes #50827
We added a new rounding in #50609 that handles offsets to the start and end of the rounding so that we could support `offset` in the `composite` aggregation. This starts moving `date_histogram` to that new offset.
…0873) We added a new rounding in elastic#50609 that handles offsets to the start and end of the rounding so that we could support `offset` in the `composite` aggregation. This starts moving `date_histogram` to that new offset.
Adds support for the `offset` parameter to the `date_histogram` source of composite aggs. The `offset` parameter is supported by the normal `date_histogram` aggregation and is useful for folks that need to measure things from, say, 6am one day to 6am the next day. This is implemented by creating a new `Rounding` that knows how to handle offsets and delegates to other rounding implementations. That implementation doesn't fully implement the `Rounding` contract, namely `nextRoundingValue`. That method isn't used by composite aggs so I can't be sure that any implementation that I add will be correct. I propose to leave it throwing `UnsupportedOperationException` until I need it. Closes elastic#48757
When I implemented elastic#50609 I suppressed a few backwards compatibility checks until I finished the backport. I've now finished the backport so this enables the tests.
We added a new rounding in elastic#50609 that handles offsets to the start and end of the rounding so that we could support `offset` in the `composite` aggregation. This starts moving `date_histogram` to that new offset.
) We added a new rounding in elastic#50609 that handles offsets to the start and end of the rounding so that we could support `offset` in the `composite` aggregation. This starts moving `date_histogram` to that new offset. This is a redo of elastic#50873 with more integration tests. This reverts commit d114c9d.
…51495) We added a new rounding in #50609 that handles offsets to the start and end of the rounding so that we could support `offset` in the `composite` aggregation. This starts moving `date_histogram` to that new offset. This is a redo of #50873 with more integration tests. This reverts commit d114c9d.
Adds support for the
offset
parameter to thedate_histogram
sourceof composite aggs. The
offset
parameter is supported by the normaldate_histogram
aggregation and is useful for folks that need tomeasure things from, say, 6am one day to 6am the next day.
This is implemented by creating a new
Rounding
that knows how tohandle offsets and delegates to other rounding implementations. That
implementation doesn't fully implement the
Rounding
contract, namelynextRoundingValue
. That method isn't used by composite aggs so I can'tbe sure that any implementation that I add will be correct. I propose to
leave it throwing
UnsupportedOperationException
until I need it.Closes #48757